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I. Introduction 

Markov population models (MPMs) are continuous-time 
Markov chains (CTMCs) that describe the dynamics and 
interactions of different populations. They have important 
applications in the life science domain, in particular in ecology, 
epidemics, and biochemistry. Depending on the system under 
study, a member of a population represents an individual that 
belongs to a certain biological species, an organism that suffers 
from an infectious disease, or a certain type of molecule in 
a living cell. Thus, if n is the number of populations, the 
state space of the MPM is N n , that is, a state is a vector 
(x\, . . . ,x n ) of non-negative integers, where the entry x» is 
the size of the i-th population. Typically, the transitions of an 
MPM are described by a finite set of transition classes such 
that each class specifies a (possibly infinite) number of edges 
in the underlying transition graph. For instance, we may have 
one class to represent the death of individuals. In biochemistry, 
each chemical reaction describes a class of transitions in the 
associated MPM. Often, the corresponding transition rates are 
state-dependent, e.g., the rate at which individuals of a certain 
population die may depend on the population size. 

The structural regularity of MPMs often enables accurate 
approximations of the system behavior. One such example is 
the widely-used deterministic approximation of the dynamics 
of chemical reaction networks [7| that represents the states 
as a continuum. But if one or more populations are small a 
discrete representation of the population sizes is important and 
continuous approximations are inaccurate. This effect has also 
been observed experimentally in the context of chemical reac- 
tions 1T3], 021, 01j). In such cases the analysis of the MPM 
becomes difficult. Closed-form solutions are only possible in 
special circumstances and numerical solution techniques 
suffer from the problem that a very large or even infinite state 
space has to be explored. Therefore, Monte-Carlo simulation is 
in widespread use to estimate transient or stationary measures 
of the MPM. 

Recently, progress has been made on numerically approxi- 
mating the transient distribution of an MPM at particular time 
instances Q, Q, ifTTl . These approaches exploit the fact that 
only a subset of the state space is needed to give an accurate 
approximation and that only a small amount of probability 
mass is located above a certain population threshold. The 
intuitive explanation for this is simple since within a fixed 
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time interval it is extremely unlikely that the populations reach 
certain thresholds. For the long-run behavior of the system, 
however, this argument does not hold since it is a priori not 
known if and where the system stabilizes. 

In this paper, we consider the problem of computing an 
accurate approximation of the equilibrium distribution of an 
MPM. Assuming that the MPM is ergodic, we first derive 
geometric bounds for the equilibrium distribution, i.e., we 
find those regions of the state space, where most of the 
probability mass is located in the limit. Then we perform 
a local refinement for these regions in order to bound the 
probabilities of individual states. 

Let {X(t).t > 0} be an MPM. The first step relies on an 
analysis of a drift function d that associates with each state 
x the expected change d(x) = j^E\g(X(t)) \ X(t) = x). 
Here, g is a function that associates with each state x a 
non-negative real number, also called Lyapunov function. We 
illustrate this by means of an example. Figure [T] shows the plot 
of the equilibrium distribution of an MPM that describes a 
gene regulatory network. The system is bistable, that is, in the 
long-run the probability mass is concentrated at two distinct 
regions in the state space. In these regions the drift d(x) is 
maximal. We determine geometric bounds (here depicted as 
dashed lines) by using a simple threshold on the drift, i.e., we 
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Fig. 1. The equilibrium distribution n of an MPM that describes a gene 
regulatory network and geometric bounds enclosing a subset C of the state 
space with X] c eC 7r ( c ) > 0-9- 



consider the set C of all states where the drift is greater than a 
certain threshold. Then C contains those states, where most of 
the probability mass is located in equilibrium. For the amount 
of probability within C, we derive tight bounds. 

In the second step of our approach, we consider the 
stochastic complement of C [9| which allows us to derive 
bounds for the probabilities of individual states. The stochastic 
complement is a finite CTMC with state space C, where each 
outgoing transition leading to a state not in C is redirected 
to C. This redirection is defined in such a way that the 
equilibrium distribution of the stochastic complement gives the 
conditional equilibrium probabilities for the states of C. Since 
the exact redirection probability can only be obtained from a 
full solution of the infinite system, we consider candidates that 
give upper and lower bounds. Together with the first step of the 
approach, this yields bounds for the equilibrium probabilities 
of all states of the MPM X. 

The full paper version will contain all relevant proof details. 

II. Markov Population Models 

We consider a class of time-homogeneous CTMCs that can 
be described by a finite set of transition classes. A transition 
class r is a pair (a,v), where a : N n —> M>o is a function 
that determines the transition rate and v E Z™ is a change 
vector that determines the successor state of the transition. 
Thus, if x E N n and a(x) > then there is a transition 
from state x to state x + v with rate a(x). We assume 
that v has at least one nonzero entry and that a(x) is a 
polynomial in x = (xi, . . . , x n ). Let {n, . . . , r^.} be a set of 
transition classes with distinct change vectors. Let Q be the 
infinite matrix such that the entry Q(x, x + Vj) equals atj(x), 
where Tj — (ctj,Vj) and j E {1, ...,k}. If we define the 
diagonal entries of Q as the negative sum of the off-diagonal 
entries, then Q is the infinitesimal generator of a CTMC 
{X(t),t > 0}. The matrix Q has only finitely many nonzero 
entries in each row and in each column, i.e., Q is an infinite 
matrix with finite rows/columns. Note that sup^ |Q(x, x)\ may 
be infinite and that the number of states reachable from a given 
initial state may be infinite. 

III. Geometric Bounds 

We assume that t\ , . . . , Tk are such that X is ergodic and 
7r is the equilibrium distribution of X. Then there exists a 
function g* : N™ — > M. + , a finite subset C, and a constant 
7 > such that OH 

(i) f t E[g*(X(t)) | X(t) = x}< - 7 , Vx E N n \C, 

(u) f t E[g*(X(t)) | X(t) = x] < oo, Vx E C, (1) 

(Hi) {x E N™ | g*(x) < £} is finite for any £ < oo. 

Let c be a positive number with c > max^ge E[g* (X (t)) \ 
X(t) = x] and let g(x) = 9 ^ . In the sequel, we will refer to 
g as the Lyapunov function. The first two conditions in Eq. (|1J) 
are now equivalent to 



where \c ( x ) — 1 if x & C and otherwise. Note that if we 
multiply Eq. Q with ir(x) and sum over x, the left-hand side 
becomes zero and we arrive at 
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Thus, we can use Eq. |2]i to bound the probability mass outside 
of C. 

For a given Lyapunov function g, we define the drift d(x) 
in state x as d(x) = ^E[g(X(t))\X(t) = x\. Since X has a 
transition class description, 

k 

d ( x ) = J2 a i( x ^ g ( x+v ^~ g ( x ^- 

If g is a polynomial of degree t g > and the highest degree of 
the rate functions ay is Id then the drift function d is at most a 
polynomial of degree (£ g — l)£d- For most population models, 
id < 2. Moreover, often a degree of £ g — 2 is sufficient for g. 
In these cases, we can easily determine the maxima of d and 
use Eq. |2| to derive bounds for the probability mass outside 
of C. Note that we can also bound the probability mass inside 
C with symmetric arguments O. 

Example 1 We consider a gene regulatory network called the 
exclusive switch ^8y. It consists of two genes with promotor 
regions. Each of the two gene products P\ and Pi inhibits 
the expression of the other product if a molecule is bound to 
the respective promotor region. More precisely, if the promotor 
region is free, molecules of both types P\ and Pi are produced. 
If a molecule of type P\ is bound to the promotor region of Pi, 
only molecules of type P\ are produced. If a molecule of type 
Pi is bound to the promotor region of P\, only molecules of 
type Pi are produced. No other configuration of the promotor 
region exists. The system has six chemical species of which 
two have an infinite range, namely P± and Pi. We define the 
transition classes Tj — (ctj, Vj), j E {1, . . . , 8} as follows. 

• For j E {1,2} we describe production of Pj by Vj(x) = 
ej and otj(x) = 0.05x2+^. Here, Xi+j the number of 
active genes that produce Pj, which is either zero or one. 
The j-th entry of a state x represents the number of Pj 
molecules and the vector ej is such that all its entries 
are zero except the j-th entry which is one. 

• We describe degradation of Pj by Vj +2 — —Cj and 
a,j+i(x) — Q.QQbxj. Here, Xj denotes the number of Pj 
molecules. 

• We model the binding of P\ to the promotor (which 
inhibits the gene that is responsible for the production 
of Pi) as V5 = — ei — e4 + e 6 , and a 5 (x) — 0.1x1X3X4. 
Here, xq is one if a molecule of type P\ is bound to the 
promotor region and zero otherwise. Note that a^ is zero 
in all states where the promotor is not free (X3 — or 
Xi = 0). 

• We model the binding of Pi to the promotor (which 
inhibits the gene that is responsible for the production 
of Pi) as vq — —62 — £3 + e^, and a§(x) = O.IX2X3X4. 



Here, X5 is one if a molecule of type P 2 is bound to the 
promotor region and zero otherwise. 

• For unbinding of P\ we define v-; — e\ + e^ — e^, and 
a 7 (x) =0.005:r 6 . 

• For unbinding of P 2 we define v$ — e 2 + ej, — e§, and 
a 8 (x) = 0.005a; 5 . 

We use the Lyapunov function q given by g(x) — x\ + x\ + 

4- 



x\. Consequently, the drift becomes 



d(x) = 0.05x 3 (2a;i + 1) + 0.05x 4 (2x 2 + 1) 

+ O.IX3X4X1 (—2x4 — 2xi + 2x5 + 3) 

+ O.IX3X4X2 (—2x 3 — 2x 2 + 2x 6 + 3) 

+ 0.005xi(-2.X! + 1) + 0.005x 2 (-2x 2 + 1) 

+ 0.005x5 (-2x 5 + 2x 3 + 2x 2 + 3) 

+ 0.005x 6 (-2x 6 + 2x 4 + 2xi + 3). 

With the initial condition Xj+ 2 = L invariantly it holds 
that Xj + 2 G {0, 1} and Xj+% = 1 — Xj+4. The global 
maxima of d(x) (when considering real valued Xj) therefore 
are found at x( ml > = (0.25,5.75,0,1,1,0) and x( m2 ) = 
(5.25,0.25,0,1,1,0). The maximal value of the drift in the 
reachable part of the state space consequently is lower or 
equal to 

c = d(x {ml) ) = d(x< m2 >) = 0.38625. 

We are interested in a set C with X^ec n ( x ) > 1 — e> where 
e G (0, 1) is an a priori chosen threshold. Let 7 > be such 
that e = c/(c + 7). We choose C such that is contains all 
states where the drift is greater than 7. Note that C is finite 
since g fulfills condition Hi) in Eq. ([TJ. It is easy to verify that 
Eq. |2]l holds if we scale the Lyapunov function g by 7 + C, 
that is, g(x) — (x\ + x\ + . . . + XeJ/il + c )- We retrieve the 
normalized drift 



d s (x) = 



7 + c 



d(x). 



Therefore, C = {1 £ N 6 d s (x) > e — 1} and we get the 
desired bound for the equilibrium probability inside C. With 
the constraints on Xj and Xj+ 2 we only have to consider the 
bounds for x\ and x 2 and derive three cases, namely: 

1) di(x) = d s (xx,X2,l,0,0,l) = -0.01.x 2 - O.lxl + 
0.115x1 + 0.005x2 + 0.055 > e - 1, 

2) d 2 (x) = d s (xi.x 2 ,0,l,l,0) = -O.Olx 2 - O.lx 2 + 
0.005xi + 0.115x2 + 0.055 > e - 1, 



3) d 3 (x) = 4(xi,x 2 , 1,1,0, 0) 
0.205(xi +x 2 )+0.1 >e-l, 
illustrated by FigureU\for e = 0.1. 
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In order to apply the approach described above, one has to 
find a Lyapunov function g such that Eq. |2]) holds for some 
c and 7. This may become difficult for complex systems even 
though often a quadratic function is sufficient and it is possible 
to optimize the coefficients. 



IV. Conditional Probabilities 

In order to derive probability bounds for the individual states 
in C, we consider the following partitioning 
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Q[c,c] 

Q[C,C] 



Q[C,C] 
Q[C,C] 



of the generator matrix Q into blocks that describe the 
transitions within C, to C, within C and back to C. Let E 
denote the embedded Markov chain of Q, i.e., 



E = I Dq X Q = 



E[C,C] 
E[C,C] 



E[C,C[ 
E[C,C] 



where I denotes the identity matrix and Dq is a diagonal 
matrix such that Dq(x,x) = — Q(x,x) for all states x. 
The stochastic complement of C is defined as 



00 



Q = Q[C,C] + Q[C,C] ^(E[C,C]) l E[C,C]. 

Using the fact that X is ergodic, we are able to show that 
Q is well-defined and is the infinitesimal generator of a finite 
ergodic CTMC. _ 

Let ttc denote the equilibrium distribution of Q. For finite 
discrete-time Markov chains, it has been shown in the seminal 
work by Meyer J9] that the entries of ttq are equal to the 
conditional equilibrium probabilities of X, i.e., irc(x) = 
7r(x)/^ ceC 7r(c) for all x. In what follows, we extend this 
result to infinite MPMs. 

The construction of the stochastic complement requires that 
transition probabilities inside the infinite set C are calculated. 
Since this is infeasible for MPMs, we apply a similar technique 
as proposed by Courtois and Semal for finite CTMCs|l|, 
(2) • The idea is to only consider the set C and redirect the 
transitions that lead from C to C back to C. The matrix Q 
contains the "exact" redirections of the transitions, i.e., the 
solution of the corresponding CTMC gives the conditional 
probabilities of the states in C. Since we cannot construct Q, 
we redirect the transitions in such a way that we obtain upper 
and lower bounds. 

We first consider the substochastic matrix W given by 

W = I+^Q[C,C] 
A 

with A > maxjcgc — Q[C,C](x,x). If we increase the j-th 
column of W such that it becomes a stochastic matrix, it 
is easy to see that the result represents an ergodic discrete- 
time Markov chain for j = 1, . . . , \C\. When computing the 
conditional probabilities for relatively small values of \C\, one 
can pass the slack probability mass summing up the rows of 
W to 1 in an extra column to a dummy state and add an 
extra row corresponding to the dummy state which redirects 
the system to state j with probability 1. After removing the 
redundant last equation, the transposed linear system can be 
written such that W T — J is the coefficient matrix and ej 
is the right-hand side vector. This makes it possible to LU 
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Fig. 2. Difference between upper and lower bounds on states in C for 
e= 0.1. 



factorize W T — I only once, and obtain the solution by 
forward and backward substitutions followed by normalization 
for j = 1, . . . , \C\, Now, let tTj be the associated equilibrium 
distribution. From this, we are able to derive that for all x E C 

W/ \ ^ 7T(XJ w . . 

mm7r (x) < = --^- < max7T (x), 

For a given threshold e > 0, we first determine the set 
C as described in Section [Ell] Then we bound the individual 
(unconditioned) state probability of a state x E C by 



(1 — e)min7r (x) < it(x) < max7r (x). 
i i 

Example 2 For Example U\ we received tight bounds on the 
individual conditioned equilibrium probabilities inside C for 
e = 0.1 (cf. Fig. 17). In Fig. H\we plot the difference between 
upper and lower bounds for all states. We achived a precision 
of 8 — 3.5 • 10~ 4 , i.e., the maximal difference between 
upper and lower bound was 5. Note that a lower bound 
can be retrieved by multiplying the distribution of Fig. [7J by 
1— e = 0.9. The upper bound ofiv{x) is given by max-,- ■k^(x). 
There is a total of 1671 states in C for the chosen value of e. 
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V. Conclusion 

We have demonstrated how to calculate equilibrium proba- 
bility bounds of infinite MPM by combining Lyapunov theory 
with numerical approximation and bounding techniques. Much 
remains to be done with respect to implementation efficiency, 
since various time-space tradeoffs appear worthwhile to be 
explored. 



